Estimating small frequency moments of data stream: a characteristic function approach

Authors

  • Sumit Ganguly
  • Purushottam Kar
Abstract

We consider the problem of estimating the first moment of a data stream, defined as F1 = ∑_{i∈{1,2,...,n}} |fi|, to within (1 ± ε)-relative error with high probability. Several algorithms are well known for this problem, including the median estimator over p-stable sketches by Indyk [11], the geometric means estimator over p-stable sketches by Li [13], and the Hss sketch based algorithm in [8]. The current best algorithm, given by Kane, Nelson and Woodruff in [12], uses space O(ε^{-2} log(mM)) and is proved to be space-optimal. In this paper, we present a novel, space-optimal algorithm for estimating Fp with an elementary analysis that is based on the characteristic function of stable distributions.
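
As a point of reference for the prior work mentioned above, the following is a minimal sketch of a median estimator over 1-stable (Cauchy) sketches for F1, in the spirit of Indyk's estimator. It is not the algorithm proposed in this paper. The class name `CauchySketch`, the helper `cauchy_entry`, and the string-based seeding scheme are illustrative assumptions; a space-efficient implementation would use limited-independence pseudo-randomness rather than reseeding a generator for every entry.

```python
import math
import random
import statistics


def cauchy_entry(row, item, seed=0):
    """Standard Cauchy value for (row, item), regenerated on demand.

    Hypothetical seeding scheme: a string seed makes the value reproducible,
    so the same item always meets the same random coefficient.
    """
    rng = random.Random(f"{seed}:{row}:{item}")
    # Inverse-CDF sampling: tan(pi * (u - 1/2)) is standard Cauchy.
    return math.tan(math.pi * (rng.random() - 0.5))


class CauchySketch:
    """Linear sketch y_r = sum_i f_i * C[r][i] with Cauchy entries C[r][i]."""

    def __init__(self, k, seed=0):
        self.k = k            # number of counters (rows)
        self.seed = seed
        self.y = [0.0] * k

    def update(self, item, delta):
        # Stream update (item, delta): f_item += delta; delta may be negative.
        for r in range(self.k):
            self.y[r] += delta * cauchy_entry(r, item, self.seed)

    def estimate_f1(self):
        # By 1-stability, y_r is distributed as F1 times a standard Cauchy,
        # and the median of |Cauchy| is 1, so median(|y_r|) estimates F1.
        return statistics.median(abs(v) for v in self.y)
```

For example, feeding the updates (1, +5), (2, -3), (7, +10) into `CauchySketch(k=1000)` gives a stream with F1 = 5 + 3 + 10 = 18, and `estimate_f1()` should return a value close to 18; the relative error shrinks roughly as 1/√k.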

Similar resources

Estimating Frequency Moments of Streams

We will develop algorithms that can approximate Fk by making one pass over the stream and using a small amount of memory o(n+m). Frequency moments have a number of applications. F0 represents the number of distinct elements in the stream (which the FM-sketch from last class estimates using O(log n) space). F1 is the number of elements in the stream, m. F2 is used in database optimization engines t...

Instructor: Chandra Chekuri. Scribe: Chandra Chekuri. 1 Estimating Frequency Moments in Streams

A significant fraction of the streaming literature is on the problem of estimating frequency moments. Let σ = a1, a2, ..., am be a stream of numbers where, for each i, ai is an integer between 1 and n. We will try to stick to the notation of using m for the length of the stream and n for the range of the integers. Let fi be the number of occurrences (or frequency) of integer i in the stream. We let f ...

Measuring technological gap ratio of wheat production using StoNED approach to metafrontier

The aim of this paper is to use the concept of the metafrontier function to study the determination of efficiency differentials and Technological Gap Ratio (TGR) on wheat production in Khorasan Razavi province. In this study, we used the metafrontier function and group frontier based on the concept of Stochastic Nonparametric Envelopment of Data analysis (StoNED). The data used in this stud...

Better Bounds for Frequency Moments in Random-Order Streams

Estimating frequency moments of data streams is a very well studied problem [1–3,9,12], and tight bounds are known on the amount of space that is necessary and sufficient when the stream is adversarially ordered. Recently, motivated by various practical considerations and applications in learning and statistics, there has been growing interest in studying streams that are randomly ordered [3,4...

Journal title:
  • CoRR

Volume: abs/1005.1122

Publication date: 2010